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Abstract We develop a cavity-based method which allows to extract thermodynamic properties from posi¬ 
tion information in hard-sphere/disk systems. So far, there are available-volume and free-volume methods. 

We add a third one, which we call available-volume-after-takeout, and which is shown to be mathematically 
equivalent to the others. In applications, where data sets are finite, all three methods show limitations, and 
they do this in different parameter ranges. We illustrate the principal equivalence and the limitations on 
data from molecular dynamics - In particular, we test robustness against missing data. We have in mind 
experimental limitations where there is a small polydispersity, say 4% in the particle radii, but individual 
radii cannot be determined. We observe that, depending on the used method, the errors in such a situation 
are easily 100% for the pressure and 10 kT for the chemical potentials. Our work is meant as guideline 
to the experimentalist for choosing the right one of the three methods, in order to keep the outcome of 
experimental data analysis meaningful. 
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1 Introduction 

Recent years have seen a growing number of experiments 
which use real-space measurements to extract quantita¬ 
tive information on thermodynamics and on dynamics of 
colloids [1,2,3,4,5]. Some of the investigated systems are 
crystalline [6,7,8,9,10,11], some are disordered and more 
or less dense [12,13,14,15,16,17]. Extracting a full ther¬ 
modynamic description out of the available data is still 
a challenging task. Typical data consist of configuration 
snapshots, for example from video recording, or from con- 
focal microscopy, which provide the positions of all par¬ 
ticles at given times. Any thermodynamic description de¬ 
pends on the choice of the disregarded degrees of freedom. 
In particular, one typically disregards the degrees of free¬ 
dom of a surrounding fluid and describes it in a more or 
less effective way. The challenge in this task thus consists 
in the determination of a set of thermodynamic variables, 
say pressure and chemical potentials, which is consistent, 
that is both quantities are calculated from the same data 
~ for example from the particle centers alone, disregarding 
surrounding fluids. 

A number of experiments aim at hard colloids [4,18, 
14], where the data analysis is the same as for hard-sphere 
simulations. There, simulations can serve for the devel¬ 
opment of good data treatment algorithms and for their 
calibration. On top of the conceptual question how to 
extract a consistent set of thermodynamic quantities, in 
real-world application, also the robustness of these algo¬ 
rithms against noise and missing data is important. Usual 


sources of noise in experiments are limited resolution of po¬ 
sition and size of the particles [19], shape variations, and 
many more. In particular, there seems to be an unavoid¬ 
able amount of size variations (polydispersity) of around 
4% [18,14]. Are the used algorithms for the data analysis 
robust against this variation? 

For hard spheres the determination of pressure and 
chemical potentials reduces to a geometrical problem. One 
family of algorithms to calculate them is based on measur¬ 
ing the space where a particle can be inserted into a given 
configuration. The general scheme is to take a set of con¬ 
figurations, calculate a certain geometrical quantity from 
each, and average it over the configurations. The different 
methods vary in the choice of the geometrical quantity and 
of the configurations. Widom [20] described the available 
volume, that is the volume Vq where yet another particle 
can be inserted into a configuration of N particles. Clearly, 
the available volume may consist of several disconnected 
regions (cavities). Figure la shows an example, where Vq is 
the union of the green volumes. The notion of chemical po¬ 
tentials is directly linked to the insertion of another parti¬ 
cle into a system. The link between Vq and the pressure is 
less obvious. It has been established by Speedy [21] who 
expressed the pair-distribution function g{r) in terms of 
the ratio of averages {So)/{Vo), where the average is done 
over all possible configurations and S'o is the area of the 
surface of Vq. This work made it principally possible to 
calculate the pressure from cavity averages. On a different 
line of reasoning. Hoover et al. [22] expressed the pressure 
on the basis of the region which a particle can explore 
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Figure 1. Some cavities in a given polydisperse conhguration 
of hard disks. Dotted circles indicate the excluded volume for 
the center of the particle in question, (a) Available volume (in 
green) for an additional particle, (b) Free volume (blue) of the 
indicated particle (hatched). The new AVATO average uses the 
union of the free volume and the remaining cavities (red). 


when all others remain fixed. This cavity, which is called 
free volume, is another geometrical quantity which can be 
extracted from a given configuration.^ An example is given 
in fig. lb. Notice that if a different particle is chosen, the 
resulting free volume is different, and the free volumes of 
two particles may overlap. Clearly, the free volume has ad¬ 
vantages over the available volume in dense systems, where 
adding yet another particle might be impossible. The free 
volume, however, never vanishes, because we first take out 
the particle in question, then determine the cavity where 
we can put it back in. Hoover et al. used a dynamical argu¬ 
ment for their formula for the pressure. Speedy [25] found 
the same formula by showing that the above ratio of aver¬ 
ages equals an average of a ratio, {Sf/Vf), where Vf is the 
free volume, Sf its boundary. Some years later. Speedy [26, 


^ There is another branch of cavity-based method which goes 
back to Kirkwood [23] and which has been specialized to hard 
spheres by Wood [24]. This method also uses the term free 
volume, but it should not be confused with the free-volume 
method presented here. The essential difference is that in the 
Wood/Kirkwood method, the cavities are extracted from one 
averaged snapshot, where the averaging may have destroyed 
the non-overlapping constraint of hard spheres. Here, cavities 
are taken from many snapshots which are all compatible with 
the constraint, and the average is done subsequently. 


27] found a way to re-derive the pressure result without re¬ 
course to the pair distribution function, by only counting 
configurations and applying elementary thermodynamics. 
This approach is more precise on how the averages are 
calculated. Interestingly, in order to get the pressure of 
N particles, one needs averages over ensembles of {N—1) 
spheres. This difference is the key for the introduction of 
free-volume methods, where configurations of one particle 
less are regarded. Taking this difference seriously allows 
to provide algorithms which work seamlessly in all cases, 
being dilute, dense disordered, or crystalline. 

If the particles have different sizes, the precise way 
how to average over cavities changes. Instead of a single 
available volume, we now have to calculate one for each 
radius of particle we want to insert. Finally, pressure and 
chemical potentials are weighted averages over these avail¬ 
able volumes. How to calculate this average was shown 
by Corti & Bowles [28]. They generalised Speedy’s work 
that uses pair-distribution functions to the polydisperse 
case. They did not use the cleaner configuration counting 
approach. 

In the present paper we enrich the family of cavity- 
based algorithms by one member. To the methods based 
on available-volume (av) and on free-volume (fv), we add 
a third one, which we call availahle-volume-after-takeout 
(avato). Its derivation below makes the AVATO method 
appear as the natural extension of the configuration count¬ 
ing idea, when passing from averages over (A^—1) particles 
to those over N particles - similarly to the free-volume 
method, but conceptually simpler. It is mathematically 
equivalent to the other two methods if averages over all 
configurations are available. As does the FV method, also 
the AVATO-average is possible in dense configurations. We 
provide formulae and algorithms for calculating both the 
pressure and the chemical potentials in all three methods, 
in the presence of polydispersity. For the two established 
methods, AV and fv, most of the equations have already 
been published and implemented [29,30,31,32] - but not 
for all the combinations of chosen method, pressure, chem¬ 
ical potential and polydispersity. For completeness we give 
all the equations here. 

The second contribution of the present paper is to ap¬ 
ply the derived algorithms to numerical data in two dimen¬ 
sions. We reproduce their principal equivalence and show 
how they start to differ on finite sets of data. We further 
test their robustness if we throw away information on indi¬ 
vidual particle radii at small values of polydispersity. The 
provided comparisons should guide experimentalists work¬ 
ing with colloids, such that they can avoid the indicated 
problems and choose the best algorithm for analysing their 
data. 

The outline of the paper is as follows. In sects. 2.1 
and 2.2 we recall the established cavity methods and de¬ 
fine some notation, restricting ourselves to the monodis- 
perse case. The avato average is derived in sect. 2.3. The 
expressions for pressure and for chemical potentials in the 
fully polydisperse case follow in sects. 2.4 and 2.5. Numer¬ 
ical results of all three methods are presented for pressure 
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(sect. 3.2) and for chemical potentials (sect. 3.3). The ef¬ 
fect of missing radius information is shown in sect. 3.4. 

2 Cavity methods 

We now present the established cavity methods and derive 
our new method in the monodisperse case. We will also 
define the necessary ensembles and averages. Consider a 
collection of N spheres/disks of diameter a in d dimen¬ 
sions, enclosed in a volume V at temperature T. In order 
to exclude surface effects, we take the volume to be peri¬ 
odic. 

2.1 Available volume 

Speedy [26] arrived at an expression for the pressure in the 
monodisperse case, using only counting arguments. His 
result is the ratio of two averages, 

pC _ . I P ... 

Here, f2{N—l) is the set of all possible ways to place 
(iV—1) spheres in the volume, and {•)o{n-i) is the av¬ 
erage over this set. In order to render the set finite, we 
may think of space being cut into small pixels of vol¬ 
ume to and observe that the Hnal expression for the pres¬ 
sure does not depend on uj. We denote one such con¬ 
figuration (or “state”) by the centers of the spheres, 
k G n{N—l) : k = {ri,... rAr„i}. Despite every particle 
having a unique number, they are not discerned within 
a state. The available volume Vo{k) is then the volume 
where the center of an additional particle, tat in the above 
case, can be placed. See Hg. la for an example. The avail¬ 
able volume may fall into several disjoint connected re¬ 
gions, called cavities. So{k) is the boundary of the avail¬ 
able volume. 

The formula (1) is not directly applicable in data anal¬ 
ysis, for two reasons: First, if we have data from an ex¬ 
periment or a simulation with N particles, we never have 
access to strictly all conhgurations. We rather take a H- 
nite series of K snapshots, S{N), and average over those. 
S{N) represents the data which is really available from 
an experiment. The number of snapshots, K, introduces 
a first level of approximation, which becomes exact in the 
limit AT —> oo. We will therefore denote this approximation 
as an equality. The second problem is that we have snap¬ 
shots of N particles, not of (N—l) particles. This prob¬ 
lem gives rise to the free-volume and the available-volume- 
after-takeout methods below. Here, it simply introduces 
another level of approximation. 

Applying eq. (1) to real-world data, we approximate 
the averages by those which we have, namely over the 
snapshots S{N), which contain all particle centers (and 
radii) at given times, 

pV ^ g ('^o(^))fce5(jV) , , 


Equation (2) can be translated into the following algo¬ 
rithm: 

Loop over the snapshots k G S{N): 

Enlarge all sphere diameters by g. 

Find the space not covered by any sphere (cavities). 

Vo{k) ^ sum of all cavity volumes. 

So{k) ^ sum of all cavity boundaries. 

Accumulate S'o(fc) and Vb(^) in independent averages. 

End loop over k. 

The enlargement step takes care of the excluded volume 
of both the present particles and the additionally inserted 
one, see fig. 1. The cavities are for the center of the in¬ 
serted particle only. The algorithm to find the volumes 
and boundaries which are not covered by any sphere is 
discussed in sect. 3.1. 

Effect of finite N and K 

Generally speaking, in the limit N ^ oo and iG —)■ oo 
equation (2) and similar expressions in the following sec¬ 
tions become exact. In practice both parameters are Hnite, 
and this introduces deviations in the approximations. The 
precise nature of the deviations is subtle, they depend on 
the number density N/V and on the quantity that is be¬ 
ing averaged. We now try to capture some aspects of the 
deviations, without being exhaustive. 

Let us focus on the individual effect of finite N first, 
assuming K to cover all possible configurations. This is as 
if we replaced S{N) by f2{N) in eq. (2). Still, this equa¬ 
tion aims at calculating the pressure of a different sys¬ 
tem than the original one, namely p{N-\-l,V,T) instead 
of p{N, V, T). There is a (small) error in the density of the 
order 1/A^ which translates into an error in the pressure. 
We can neglect it in the limit of many particles. 

Now, if K is finite, everything depends on the concrete 
configurations contained in S{N). Above all we need the 
cavities to be sufficiently numerous to build reliable aver¬ 
ages. If the particle density is sufficiently small, this is the 
case because we can nearly always insert another particle. 
The density deviation described above is thus the only sys¬ 
tematic error and can be controlled by choosing N large 
enough. The finite number of snapshots introduces addi¬ 
tional random noise. Surely, AT —> oo will make this noise 
disappear, but this does not imply that the rate of con¬ 
vergence is sufficient for practical applications. This point 
will remain open in the present paper; we would only like 
to mention that there are examples in the literature [33] 
where the quantity of interest converges so slowly that 
advanced extrapolation methods are required. 

The influence of finite K and N is more subtle for crys¬ 
talline systems. It may happen that not a single snapshot 
in the given set S{N) allows insertion of another particle. 
In such a case we cannot calculate the average {•)k&siN)- 
(What we said above in the exact limit AT —>■ oo remains 
valid, however, because we will find at least one extensi¬ 
ble configuration in f2{N) if N is large enough for the 
given density.) In order to see cavities in a monocrystal of 
N particles, we rely on spontaneous fluctuations to make 
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sufficient space for particle insertion. Their rate of appear¬ 
ance is a function of the density N/V and of the total 
number N and becomes exponentially small at high den¬ 
sities [27,34]. In practice, where K is limited to a few 
thousand, we are not astonished to see few or no cavities 
in the snapshots and to see badly converged averages. 

For computer simulations one way out of this problem 
is to take 1) from the beginning, that is to simulate a 

crystal with a vacancy [35]. This guarantees that the num¬ 
ber of cavities is around one even in the densest system. 
In fact, this gives directly an approximation to eq. (I). We 
will not pursue this idea further in the present paper, be¬ 
cause it is rather limited to computer simulations, and we 
here focus on what can be done with experimental data, 
taking simulations only as a test ground. Notice however 
that the number of cavities, Nc{k), is a nice example for a 
function that does not give the same value when averaged 
over n{N) and over n{N—l). 


2.2 Free volume 

A well-established way for the above problem, passing 
from averages over iV—1 particles to N particles, but with¬ 
out changing the system, is to introduce the so-called 
free-volume averages [25,27,28,30]. This method allows 
to measure pressures even in crystalline or nearly jammed 
systems. 

From a configuration k G n(N) we choose one par¬ 
ticle at Ti, take it out, and such construct the reduced 
state k\ri G f2(N—l). The resulting available volume 
Vo(k\r i) is nonzero. One of its cavities contains the point 
This cavity is called the free volume, denoted by Vf {k\ri). 
It is depicted in blue in hg. lb. In terms of free volumes, 
eq. (1) reads 


NkT 2d\Vf / 


( 3 ) 


where the ratio of averages has turned into the average 
of a ratio. The precise dehnition of J^{N) requires going 
into greater detail, which is done in several steps in the 
remainder of this section. The passage from (1) to (3) has 
been described several times in the literature [25,27,28, 
30]. We find it worth summarising these references in order 
to make the difference to our new method clear, which will 
be introduced in sect. 2.3. 

In a first step, one constructs the set C(iV—1) of all pos¬ 
sible cavities which can be obtained from 17(7V—1). The 
ratio of averages has not changed at this point, and both 
are counting averages, 

{bibWjlT ^ T(M)> ik,l)eC{N-l) 

Here, we labelled every cavity by both the configuration k 
in which it occurs and a number I = 1,..., iVc(fc), where 
Nc{k) is the number of cavities in this configuration. Vc{k, 1) 
denotes the volume of the cavity, and Sc(k, 1) its boundary. 


The next step introduces a probabilistic element in the 
averages used in eqs. 4. Instead of averaging over all pos¬ 
sible cavities, one chooses cavities at random with a prob¬ 
ability proportional to their volume Vc(k,l) [30], 

P{k,l):=V,{k,l) / Y. yc{k',l'). (5) 

{k' ,l')^C{N-l) 

The corresponding average of a quantity f{k, 1) is denoted 

by 

{f)r(N-i)--= E mi)f{k,l). (6) 

(k,l)&e(N-l) 

The ratio of averages in eq. (1) now becomes an average 
of ratios. 


^ / Scik,l) \ 

(w(fc, O)c(jv-I) \ bc(^, 0/■p(Ar-i) 

In a last step one passes from (IV—1) to N spheres, which 
defines the average over iF{N) as the following procedure: 
One chooses uniformly a configuration k G f2{N) and then 
again uniformly one of the spheres i G {!,... iV}. This 
latter choice can be repeated many times without chang¬ 
ing the probability space, such that in the end it is equal 
to deterministically choosing every sphere once. In order 
to prove the equivalence of the averages over iF{N) and 
over 7^(A^—I), one has to show that the cavity probabil¬ 
ity P{k', 1) {k' G f2(JV—l)) leads indeed to a homogeneous 
distribution of conhgurations k G f2{N), and vice versa. 

On the algorithmic level, what eq. (3) means in the 
analysis of a series of snapshots is the following: 

Loop over the snapshots k G S{N): 

Loop over all particles i: 

Take out particle i. 

Increase all other diameters by Ui. 

Find the space not covered by any sphere (cavities). 
Identify the cavity which contains the center r^. 
Vf{k\ri) G- vol ume of this cavity. 

Sf(^k\ri) G- boundary of this cavity. 

Accumulate Sf/Vf over both loops. 

End loop over i. 

End loop over k. 
or as a formula, 

- 1 = —f8l 

NkT 

We write eq. (8) as an identity, which is strictly valid only 
in the limit of an infinite number of snapshots. 


2.3 Available volume after take-out 

We will now develop a third, new method which turns 
out to be an alternative to the available-volume and free- 
volume averages. 
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Let us start with a naive algorithmic question: In the 
above algorithm, why not take the whole available volume 
of the reduced state instead of only one of its cavities? 
In the configuration of fig. lb we would take the union 
of the blue and the red cavities. We call this union vol¬ 
ume the available volume after take-out (avato), denoted 
by Vo(^\ri). An average over this volume inherits the ad¬ 
vantage from the FV-average that even at high densities 
there is always at least one cavity to calculate. With the 
Av-average it shares the advantage that there are more 
than one cavity, possibly many, which contribute when 
the system is not dense. This improves the stability of the 
averages. In this sense the AVATO-average combines the 
best of two worlds. 

Using this AVATO-average, the pressure is calculated 

by 



The answer to the above question is affirmative: Equa¬ 
tion (9) is entirely equivalent to eq. (I). Even better, this 
equivalence is based only on the counting of states, no ad¬ 
ditional probability is required. Let us start with all con¬ 
figurations k' G If a state k' is extensible, it can 

be the result of taking one particle out of a configuration 
k € fI{N). In fact, there are exactly VQ{k')/ijV different 
such states k which lead to k'. If a state k' is inextensible, 
we have Vo{k') = 0. In the reverse direction, starting with 
all configurations k G for each of them there are ex¬ 

actly N ways to produce an extensible state k'. The such 
constructed mapping between states k and k' is far from 
being one-to-one, but the whole n{N) is mapped to the 
extensible states, a subset of 1). This means that 

for any function / : f2(fV—I) —> R we have the equality 

k&n{N) i=l k'eO(N-l) 


quantity is different. 

Loop over the snapshots k G S{N): 

Loop over all particles i: 

Take out particle i. 

Increase all other diameters by Ui. 

Find the space not covered by any sphere (cavities). 
Vb(A:\ri) G- sum of all cavity volumes. 

5'o(fc\ri) ^ sum of all cavity boundaries. 
Accumulate S'o/Vb over both loops. 

End loop over i. 

End loop over k. 


2.4 Polydisperse case: pressure 

Eor simplicity of the notation, we derive the expressions 
for a binary mixture only, the generalisation to multicom¬ 
ponent systems is evident from the structure of the formu¬ 
lae. The mixture contains Na spheres of diameter aA and 
JVb spheres of a different diameter as- In a configuration 

k= (12) 

all the A-particles can be interchanged among themselves 
without changing the state, and equally the B-particles. 
The set of all possible states is denoted by n{NA,NB)- 
When we allow particles to have different sizes, also the 
available volume starts to depend on the (type of) particle. 
We will write Ug“ for the volume where we can insert the 
center of another particle of type a, and accordingly for 

S OC J/CX 

0 ’ • 

In order to generalize eq. (I) to the polydisperse case, 
we can follow the same arguments as in Ref. [26]. This is 
a straightforward task, but it has not yet been published. 
The result is similar to Corti & Bowles’ eq. (59), only that 
the (A^—1) averages show up explicitly. 


Every extensible state is counted an equal number on both 
sides of the equation. Notice that the weighting Vo(k')/u! 
automatically eliminates the inextensible states from the 
sum on the right-hand side. We may therefore sum over 
all states f?(N—l). Notice further that when we specialize 
eq. (10) to the constant function f(k') = 1, we obtain 
Speedy’s eq. (5) for the number of configurations [26]. The 
equivalence of eqs. (1) and (9) is now shown by writing out 
the averages as sums and then using eq. (10) twice, with 
f{k') = S'o(fc')/Vb(^0 s-nd with f{k') = I. 

When applied to a finite set of snapshots, only the type 
of average changes, as compared to eq. (9), 

1 ^ ^ ^ 5'o(fc\r^) \ 

NkT 2d\N Vb(fc\ri) / kGS{N) 

Again, this approximation becomes exact in the limit of 
infinite snapshots. The corresponding algorithm is nearly 
identical to the one for free volumes, only the averaged 


pV _ , ^ 1 i^o)0{N^-1,N.) 

NkT ~ 2dN ^ 

JV Vo )oiN^-l,N.) 


(13) 


The notation f2{Na—l, N.) stands for f7(Af4—1, Nb) or for 
f2{NA, Nb—1), respectively. The algorithm now includes 
a loop over the particle types, 

Loop over the snapshots k G S{N): 

Loop over the particle types a: 

Enlarge all sphere diameters by ctq. 

Find the space not covered by any sphere (cavities). 
Vif‘{k) 4- sum of all cavity volumes. 

5'q (fc) sum of all cavity boundaries. 

Accumulate S'g (fc) and Vif‘{k) in averages. 

End loop over a. 

End loop over k. 

The algorithm evaluates the average in the available-volume 
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approximation for the pressure: 


pV 1 _ 1 V 

NkT- 

(p/av) 

From the available-volume average in eq. (13) we can 
follow the same arguments as in sect. 2.2 to obtain the 
free-volume average, 


pV _ , ^ 

NkT 


1 

2dN 


^ ^ a 

ol—A,B 






) J^-‘{Na,Nb) 


(14) 


which generalises eq. (3) to the polydisperse case. This 
is Corti & Bowles’ eq. (75). In practice, the free volumes 
are different geometrical objects for each i and each a, 
such that it is simpler to write one sum over all particles 
instead of separate sums over types and particles. We had 
this sum already in eq. (8), which now becomes 


pV _ , ^ 

NkT 


1 ^ /Sf'{k\r,) 

2dN ^^^Avpik\r,) 


(p/fv) 

kGS{NA,NB) 


Here, at and at are diameter and type of particle i, respec¬ 
tively. 

Finally, the arguments of sect. (2.3) hold for every par¬ 
ticle type individually. For each a we obtain an equation 
such as (10), 


E 

kGn{NA,NB) 


E/"(‘W)= E ^ 

*=1 k'Ga{Na-l,N.) 


rik'), (15) 


where now also the averaged quantity depends on a. Using 
the AVATO average, the pressure evaluation from data then 
turns out to be 


pV _ ^ _ 1 / ^0' (fe\u) 


(p/avato) 

keSiNA.NB) 


The generalisation of the analytical formula (9) to the 
polydisperse case is analogous, only that the average is 
done over l7(iV^,7VB). 


2.5 Polydisperse case: chemical potentials 

In continuum thermodynamics, the chemical potentials 
PA, Pb are said to be derivatives of the suitable thermody¬ 
namic potentials with respect to Na,Nb- As these latter 
take discrete values, we have to choose either the upper or 
the lower difference. In agreement with Ref. [27] we choose 
the lower one, because it leads to averages over configura¬ 
tions of (A^—I) particles, consistent with those in eq. (I). 
In a canonical ensemble we have the lower difference of 
Helmholtz’ free energies, 

Pa{T, U, Na, Nb) := + F(T, Na, Nb, U, a^) 

-F{T,NA-l,NB,V,aA,<JB) 

(16) 


In the microcanonical ensemble we have differences of en¬ 
tropies, 

^{E,V,Na,Nb) := -SiE,NA,NB,V,aA,aB) 

+ S{E,NA-l,NB,V,aA,aB). 

. 

The cavity methods analyse only the configuration in¬ 
tegral and disregard the kinetic part of the phase space. 
We thus require the phase space integral to factorise into 
two parts - which is the case both for the canonical parti¬ 
tion function (always), and in the considered hard-sphere 
case also for the microcanonical integral. If we denote as 
V^kin, Tconi the two factors of the phase space integral (syn¬ 
onymously for the canonical and microcanonical cases), 
the differences of eqs. (16) and (17) become (written for 
particle type A), 


PA _ '^kin(A^A —1, Nb) Pconf(A^A —1, Nb) \ 
kT \ (f]^in{NA, Nb) Pcon{{NA,NB) ) 
^ /Al \f2{NA-l,NB)\i<^ \ 
\Q{Na,Nb)\u: ) 

= ln _. 

(^0 )n{NA-l,NB) 


(18) 


Here, £ denotes an arbitrary length scale. The constants 
Aq are the thermal de-Broglie wavelengths, given in the 
canonical case by hj\J2 /k kT nia, and in the microcanoni¬ 
cal case by the indicated ratio of pkin- The Aq, only set the 
origin of the energy axis and play no role in the following. 
Equation (18) is Corti & Bowles’ eq. (48), only that here 
the average over (iV—1) particles becomes explicit. In the 
last step we made use of their relation (30). 

The appearance of the I2(1V—I)-average in eq. (18) 
gives rise to free-volume and AVATO averages, similar to 
the treatment of the pressure in the above sections - with 
the important difference, however, that here we do not 
treat a relative property (a ratio such as S/V), but an abso¬ 
lute one. This will hinder us from eliminating the average 
(■) n(N-i) entirely Concerning the average over Q[N—1), 
we have the same problem as for the pressure, that we can¬ 
not calculate eq. (18) from data snapshots. The best we 
can do for the moment is to use the statistical averages 
we can compute, arriving at 


In 


In 




(^o“(fc)> 


kGS{NA,NB) 


(m/av) 


This approximation aims at evaluating p(N + 1) instead of 
p{N), as we saw already for the pressure in eq. (2). Again, 
the approximation breaks down for systems so dense that 
one cannot insert another particle. 

The free-volume equivalent of eq. (18) is 


PA 

kT 


-InAl 


^A{^/Vf )b^{Na,Nb) 
ken(NA-i,NB) 


(19) 


^ This caveat also leads to the problem mentioned by Sastry 
et al. [30], that “the chemical potential cannot be determined 
from free-volume information alone”, we need also the number 
of cavities. 
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This is Sastry’s eq. (9) [30], generalised to the polydis- 
perse case. Notice the average number of cavities in the 
denominator, which still uses the counting average. When 
we want to turn this equation into an algorithm, again, 
we have to replace (^c^(fc))fegf 2 (Ar^-i.ArB) something 
we can compute, for example (-^,f jv^)’ 

out knowing the error we make. Thus, in data analysis we 
will use 




In • 


Nc 

E 

i=l 


V?(k\i 


kGSiNA,NB) 


{«“> 


(p/fv) 


SiNA,NB) 


Upon increasing density, we will be limited by the denom¬ 
inator, just as in eq. (fi/Av). As soon as most of the states 
are not extensible anymore, the average number of cavities 
vanishes. 

Finally, let us calculate the chemical potentials in terms 
of AVATO averages. Here, the difference between extensi¬ 
ble and inextensible states is essential. We therefore in¬ 
troduce the notation 1 , for those states in 

—1, Nb) which are extensible by a particle of type A. 
Equation (15) does not work with quantities such as f{k') = 
\/V^{k') which are infinite for inextensible states. They 
would formally annihilate the weighting V^{k') which is 
necessary to discard inextensible states from the sum. Us¬ 
ing the restricted summation repairs this problem. 


Na 

E 

k£n(NA,NB) 


jf2+(NA-l,NB)l 


w 


The take-out average of = l/V^ becomes 


( 20 ) 


to unity and does not interfere. In dense systems, how¬ 
ever, we do not know in advance into how many configura¬ 
tions we can insert a particle, knowing only that the factor 
will be larger than one, diverging for the largest possible 
density. The following approximation therefore underesti¬ 
mates the true chemical potential: 


f-^Ct 


In A" 


Nc 


1 

Uo“(fc\rf) 


(^/AVATO-a) 
keS{NA,NB) 


We may try to find different expressions, on the search 
to avoid the problem with the 7 ^ in eq. (23). For example, 
we may average /V^, which is a relative property, thus 
F)A will not intervene. Its take-out average is 




T^(Na,Nb) 


(Nf) 

tvA\ 
V'^O / 


Q(Na-1,Nb) 


n(NA-i,NB) 


1\ 

/ JfA(Na,Nb) 


(24) 


we find a quantity which we have seen already, namely the 
free-volume average of We can thus express eq. (19) 

in yet another way, namely 


MA 

kT 


-InAl 


= In 


NA{Nf/V,^) 


(Nc^) 


T^(Na,Nb) 
n(NA-i,NB) 


(25) 


Of course, this expression suffers from the same limita¬ 
tions concerning the denominator as does eq. (19). It is 
nevertheless interesting to see how we can again replace 
an average of free volumes by an AVATO average. In data 
analysis, we will use the following approximation for the 
denominator in eq. (25), 


4 \ 

/ T^(Na,Nb) 



} kGn{NA,NB) 


f2+{NA-l,NB) 

k'en+{NA-i,NB) 


I I 

(^0 }n(NA-i,NB) 


( 21 ) 


In the last step we introduced the ratio of the total number 
of states and the number of extensible states. 


niNA-i,NB) 

n+{NA-i,NB) 


( 22 ) 


The chemical potentials are then written as 


kT 


-In A" 


In 


U“ 


T‘‘{Na,Nb) 


(23) 


The occurrence of the ratio 7 ^ in this equation is unfortu¬ 
nate. We will not be able to compute it from A^-particle 
snapshots. In dilute systems, where we can insert another 
particle in practically all configurations, this factor is close 




In W 




r^(NA,NB) 


{N^) 


S(Na,Nb) 


^/ N^ik\rf) \ 

^^^ k\Voik\r?)/,,siNA,N, 

{^ks{NA,NB) 


(/t/avato-b) 


3 Numerical method and results 

We now proceed with the comparison of equations (p/Av), 
(p/fv), and (p/AVATO) for the pressure, and of the equa¬ 
tions (p/av), (p/fv), (p/avato-a), (p/avato-b) for the 
chemical potentials by applying them to data. Some of 
these equations are approximations to the real thermody¬ 
namic quantity, so we hope to get from their comparison 
some information about the quality of these approxima¬ 
tions. In order to avoid at best other sources of error, we 
constrain ourselves to snapshots which come from simula¬ 
tions, where we are limited only by the numerical precision 
in the centers and radii of the particles. 
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3.1 Numerical methods 

To generate snapshots, we used event-driven molecular dy¬ 
namics [36] with elastic collisions, such that the statistical 
ensemble is the microcanonical one - without the need 
for a thermostat. In addition to energy conservation, also 
the total linear momentum is conserved during the sim¬ 
ulation. All our simulations use periodic boundary condi¬ 
tions to avoid possible inaccuracies from cutting cavities 
at boundaries. 

The snapshots contain N = 2150 circular disks in two 
dimensions {d = 2) which have all identical mass rrii = 1. 
The diameters of the disks are chosen randomly from a 
Normal distribution with mean 1 and a prescribed stan¬ 
dard deviation, where outliers beyond three standard de¬ 
viations were discarded. In different simulations we used 
different standard deviations, starting with 0% (monodis- 
perse) and increasing in steps of 1.5% until 9%. Simulation 
units are fixed by the average particle diameter, the parti¬ 
cle mass and the temperature (fcT = 1). At the beginning 
of a simulation, we shrunk the particles drastically and 
arranged them on a hexagonal grid, with random veloci¬ 
ties chosen according to a Normal distribution - subject 
to the two constraints of zero total momentum and pre¬ 
scribed total energy. In a first phase of the simulation run, 
the particles, being in a gas phase, were successively grown 
to their individually prescribed target diameter. This was 
done without creating additional collisions, such that the 
total energy and momentum remained unaffected. After 
all diameters were attained, the system was allowed to re¬ 
lax to its thermodynamic equilibrium in a second phase 
of the simulation run. The equilibrium was either a gas 
state, at surface fractions (j) < 0.70, or a crystalline state, 
at (/) > 0.72, see footnote.^ Polydispersity shifted the tran¬ 
sition to higher values: For example, at 6% we found the 
crystalline phase at (j) > 0.78, with disclination defects, 
and at 9% polydispersity we did not find a crystalline 
structure anymore. Finally, the simulation run was con¬ 
tinued, and snapshots were recorded periodically in time. 
The period was chosen such that on average every particle 
collided five times between subsequent snapshots. 

For the pressure, which is at the same time a ther¬ 
modynamic and a mechanical quantity, we have a refer¬ 
ence value at our disposition. During simulation runs, also 
the mechanical definition of pressure, that is the volume- 
averaged linear momentum current was recorded as a time 
average. It comprises a kinetic part and a virial part [37, 
38]. This pressure, which converges very quickly because 
it is calculated from all collisions, will serve as a reference 
in the comparison of the different cavity results. We veri¬ 
fied that this numerical reference pressure coincides with 
the known virial expansion [39,40] and found deviations 
smaller than 0.1% up to </> = 0.35. 

For measuring the volume and boundary of the cav¬ 
ity in all three methods, we implemented the algorithm 
of Ref. [29], see also Refs. [30,31]. We adapted it to the 
periodic case, in two dimensions, for configurations where 

® We neither discuss the hexatic phase here, nor the precise 
location and nature of the phase transitions. 



Figure 2. The Voronoi cells (blue) which served to calculate 
the cavities in fig. la. Dotted circles are the particles, extended 
by the radius of the particle for which the cavities are calcu¬ 
lated. 

every particle can in principle have a different diameter. 
The general idea of the algorithm is to cut space into tri¬ 
angles, such that for each of them only a single disk is to 
be considered. The triangles in question are pieces of the 
Voronoi cells which come from a radical-plane construc¬ 
tion, see fig. 2 for an example. The Voronoi diagram (also 
called power diagram) is the dual of a weighted Delaunay 
triangulation (also called regular triangulation), where the 
weights are the squares of the extended radii [41,42]. For 
the calculation of the regular triangulation, we relied on 
the robust and efficient C-I--I- library CGAL [43,44]. Their 
algorithm scales as N log N in the number of particles (in¬ 
stead of N'^ as in Ref. [31]). 

The periodic version of the regular triangulation in 
CGAL is about to be developed, so that we had to im¬ 
plement this part ourselves, combining what CGAL offers 
for periodic and for regular nonperiodic triangulations [45, 
46]. For the unweighted Delaunay triangulation, it has 
been proved [47] that the periodic triangulation can be 
extracted from 3x3 periodic copies of the initial input. 
We found that this result also holds for weighted Delaunay 
triangulations. 

The source code of the program to calculate the cavi¬ 
ties will be made publicly available [48]. 


3.2 Precision of the methods, pressure 

The result of the data analysis for the pressure, according 
to eqs. (p/av), (p/pv), and (p/AVATO) are plotted as solid 
lines in fig. 3. We used 200 snapshots. Generally, all three 
cavity methods coincide with the reference pressure within 
the linewidth, only the AV average shows extreme errors in 
the crystalline phase. This behaviour is expected because 
snapshots which allow insertion of an (7V-|-I)st particle are 
very rare in the dense phase, resulting in an average over 
a single cavity in the worst case. 

In order to quantify the differences, the top-left insets 
of fig. 3 show relative errors with respect to the reference 
pressure. Generally speaking, the error is less than 1%, 
with largest errors in the dense gaseous phase, close to the 









Michael Schindler, A. C. Maggs: Cavity averages in the presence of polydispersity and incomplete data 


9 



Figure 3. Pressure extracted from molecular-dynamics snap¬ 
shots. From top to bottom, polydispersities are 0%, 1.5%, 3%, 
and 6%. Solid lines: the true radii are used; number of snap¬ 
shots is K = 200. Dotted lines: disks are intentionally treated 
as if they were monodisperse; K = 1700. Vertical gray lines 
are guides to the eye to separate liquid from crystalline states 
(without formal definition). 


transition to more ordered phases. There, the AVATO av¬ 
erage shows slightly larger fluctuations than the FV av¬ 
erage. Far in the crystalline phase the errors of the FV 
and the AVATO averages then drop again to very small val¬ 
ues. The errors of the AV average grow strongly at around 
(p > 0.65, again because the number of extensible snap¬ 
shots decreases strongly. 


3.3 Chemical potentials 


For the chemical potentials, which are not mechanical 
properties, we do not have reference data at our dispo¬ 
sition, and we can only plot the cavity averages. Since a 
truly polydisperse system has as many particle classes as 
particles, instead of all the chemical potentials we plot the 
free enthalpy, also known as Gibbs free energy, 

M 

^ = ( 26 ) 
a=l 

Here, M denotes the number of classes; for 0% polydisper¬ 
sity we have M = 1, otherwise M = N. In order to make 
free enthalpies with different numbers of classes compara¬ 
ble, we had to add a term InM. The result of the data 
analysis according to eqs. (/r/Av), (^/fv), (^/avato-a), 
and (/i/AVATO-B) are plotted as solid lines in fig. 4. We 
used the same 200 snapshots as for the pressure. 

In the figure, we see good agreement between the four 
averages, with an exception of the FV average which shows 
too low energies at moderately small surface fractions, 
^ ^ 0-2- We do not have a good explanation for 

this deviation at the moment, but we observe that it is 
related to the number of classes, thus stems from the de¬ 
nominator in eq. (^/fv). The deviation disappears when 
using M = 1 in the monodisperse case in the top panel of 
fig. 4. It reappeared in a test where we forced M = N on 
the same data. However, the same denominator is found 
also in the AVATO average in eq. (/t/avato-b), where it 
does not cause a deviation. We tested also a bi-disperse 
system and found pronounced fluctuations in the FV aver¬ 
age - to smaller and larger values - just at those values 
of (j) where it exhibits the too low energies in fig. 4. We 
thus conclude that eq. (^/fv) is extremely sensible in this 
0 -region and converges more slowly than elsewhere. 

Furthermore, only one of the four methods, namely 
eq. (/t/avato-a) gives values beyond the gaseous phase. 
The limitation for the three others comes of course again 
from their denominators, which are AV averages and thus 
give no data if the snapshots are inextensible. The fourth 
method, eq. (/t/avato-a) is known to underestimate sys¬ 
tematically the chemical potentials, so that one cannot 
rely on its results. We can check the consistency of pres¬ 
sure and chemical potentials using the Gibbs-Duhem re¬ 
lation Nadfia = Vdp — SdT, where the last term van¬ 
ishes because we worked at constant temperature. For the 
monodisperse case we rewrite the relation in two ways 
which avoid numerical differentiation. 


dp 

dp 


djpp- - p) 
dp 


= 


p, and 


(27) 
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Figure 4. Free enthalpy per particle, extracted from molecular 
dynamics snapshots. From top to bottom, polydispersities are 
0%, 1.5%, 3%, and 6%. Solid lines: the true radii are used; 
number of snapshots is K = 200. Dotted lines: disks are 
intentionally treated as if they were monodisperse; K = 1700. 



Figure 5. Test of the Gibbs-Duhem relation on monodisperse 
data. We used the shortcut /r/fcT instead of G/NkT — In + 
In M. The little triangles indicate the expected local derivatives 
according to eqs. (27). 

where p = N/V denotes the number density. Both variants 
of the equation are plotted in fig. 5. The little triangles in¬ 
dicate the expected derivative, that is the right-hand side 
of the above equations. By eye, these derivatives seem to 
coincide well with the general slope of the curves; only 
in the crystalline part slope and triangles do not agree 
at all. This was expected and proves that the values of 
eq. (/t/avato-a) beyond the gaseous phase are not physi¬ 
cal. 


3.4 Missing radius information 

Above, we have shown that the three averages, AV, FV, 
and AVATO give similar answers when applied to numerical 
data. This is in fact not surprising, since their equivalence 
has been established analytically. The agreement of the 
above data is thus a test for the implementation and for 
the convergence. 

We go now a step further and investigate the robust¬ 
ness of the three averages against imprecise snapshots. We 
will here consider only one source of imprecision, that is 
lack of the individual radii of the disks/spheres. This is 
a typical experimental situation, where one has a global 
idea of the radius standard deviation, but where the res¬ 
olution is insufficient to reliably determine the radius of 
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each particle [18]. The smallest experimental radius stan¬ 
dard deviation is about 4% [14], so our question here is 
whether already this small polydispersity changes the re¬ 
liability of results extracted with one or the other cavity 
method. 

On our numerical data this task can be done quite eas¬ 
ily. It suffices to throw away the information of individual 
radii. The algorithm which determines the cavities works 
as well on freely invented radii as on the true ones. One 
could for example attribute completely new radii accord¬ 
ing to a given distribution, or shuffle the old ones among 
all particles. Generally, we think that the details of radius 
attribution does not play a major role. We therefore follow 
a conceptually simpler route, that is we treat the polydis- 
perse system as if it were monodisperse and attribute a 
unity diameter to every particle. The results for the pres¬ 
sure and for the free enthalpy are plotted as dotted lines 
in figs. 3 and 4. 

For the pressure, we find extremely large deviations 
from the reference pressure, which grow with the surface 
fraction (p. The FV average is the first to deviate, it shows 
errors of 100% already far in the gas phase for p > 0.5. One 
has to keep the polydispersity as tiny as 1.5% to keep the 
errors of the FV average below 20%. The AVATO average 
is more stable, it works up to « 0.7. Remarkably both 
the FV and the AVATO averages are not robust against 
missing radius information in the crystalline phase, not 
even for 1.5% polydispersity - Remember that their errors 
were negligible when the radii were all correctly taken into 
account. The third method, the AV average works best up 
to the point where it fails also for the full radii information. 
Generally speaking, also the fluctuations in the curves is 
larger than in the full-radii case, although we took many 
more snapshots into account. 

For the chemical potentials, fig. 4 shows a complete 
failure of the FV method, even at rather small densities, 
and even at tiny polydispersity. Errors are generally about 
5-10 kT, and the curves fluctuate strongly. The AVATO av¬ 
erage gives the correct result, until it deviates also at high 
densities, p > 0.66. The AV average, again, gives the cor¬ 
rect result in the p range where it works. 

In the search for an explanation why the FV method, 
and to some extend also the AVATO method, are not ro¬ 
bust against missing particle radii, we remark that many 
configuration snapshots are incompatible with a modified 
radius attribution: Particles may overlap, as can be seen 
in the right-hand column of fig. 6. This has severe im¬ 
plications for the cavities, especially for the free volumes. 
Figures 6a’,b’ show cases where a cavity, originally the one 
which was the free volume of the particle, is now missed 
by the particle center. This cavity is not counted in the 
FV average, but it is counted in the AVATO average. The 
smaller a cavity, the more easily is it missed. However, es¬ 
pecially the small cavities contribute strongly to the chem¬ 
ical potentials, where \/V is averaged. In the pressure, 
which averages S/V, small cavities also contribute more 
strongly than large ones, but the scaling is weaker. For 
very small cavities, a second effect occurs, they may com¬ 
pletely disappear, as shown in last row of fig. 6. In total. 



Figure 6. Examples of what happens to the free-volume cav¬ 
ity in case of missing radius information. Left with the true 
radii, right with artificially identical radii, (a’, b’) The par¬ 
ticle center is not in “the” free-volume cavity, (c’) The cavity 
has vanished completely. 

it is qualitatively understandable why we see more errors 
in dense than in dilute systems, and why the errors are 
larger in chemical potentials than in pressures, and why 
FV averages are more affected than the others. 

We tested a number of variants of the original algo¬ 
rithms. Concerning missed free volumes, we relaxed the 
criterion a bit and counted as “free volume” also cavities 
with a distance up to 0.05 from the particle center. The 
cavities in figs. 6a’,b’ would have been counted as “free 
volumes”. As another variant, we modified the normali¬ 
sation of the averages. In the original algorithm, it reads 
{f)s{N) = jiJ2keS(N)f(^)- Instead of the total number 
of snapshots, K, we now divided by the number of snap¬ 
shots which gave a non-vanishing /, thus discarding those 
where cavities may have been missed. Finally, we modified 
the number of particle “classes” (M) for the chemical po¬ 
tentials, grouping the particles with similar radii together. 
None of the above variants, nor combinations of them im¬ 
proved the dotted curves in figs. 3 and 4. 

4 Conclusions 

We investigated three geometrical methods to extract the 
pressure and the chemical potentials from hard-sphere po- 
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sitions. All three methods are cavity-based averages and 
differ in the precise way how the average is done and what 
quantity is averaged. The theoretical derivation of these 
methods all start with an average over all possibilities to 
place (A^—1) particles in a given volume. The FV and 
AVATO methods consistently turn this average into one 
over all possibilities to place N particles in the volume, 
whereas the AV average stays with (A^—1) particles. The 
differences in how the average is done and what quantity 
is averaged implies that the three methods differ in their 
applicability to real-world data sets, that is in their con¬ 
vergence rate, and in their response to noise. 

As could be expected, there are hardly any problems 
in dilute, gas-like systems. Such systems easily explore 
their phase space, and it is easy to insert another parti¬ 
cle. Consequently, there is negligible difference in averages 
over N and over (A^—1) particles, and these averages tend 
to converge well. We encountered, however, an exception 
to this general trend, namely the combination FV aver¬ 
age/chemical potentials in fig. 4. With the same excep¬ 
tion, we found that the methods are robust to variations 
in the particle radii in the dilute regime, say for (j) < 0.5. 
This conclusion is in agreement with a previous experimen¬ 
tal/numerical study [49] done in three dimensions. In this 
reference, the AV average was calculated by counting pix¬ 
els of the data snapshots, a procedure which introduced 
errors of at least 5-10% in the individual particle radii, the 
system being minimally polydisperse. Further, the precise 
position of the particles was limited by the Brownian dif¬ 
fusion time scale. It appears that in the dilute regime, the 
AV average - together with the additional data treatment 
described in the reference - was nevertheless able to give 
correct results for pressure and chemical potential. 

Problems really start when systems become dense. At 
very high densities, say (j) > 0.75, exploring the phase 
space takes a forbiddingly long time, which, together with 
the uncertainty whether an average over N and over (A^—1) 
particles are equivalent, renders the AV average unusable. 
Here, the FV and AVATO methods propose a valuable al¬ 
ternative - for calculating the pressure. Indeed, we found 
excellent agreement of these two methods with the refer¬ 
ence pressure in fig. 3. The chemical potentials, however, 
are still out of reach, for the same reasons as for the AV av¬ 
erage. 

Between dilute and very dense, there is a dense, but not 
extremely dense regime, say 0.5 < < 0.75. Depending on 

polydispersity and initial preparation, such systems can be 
either (rather dilute) crystalline, or disordered glass-like, 
or disordered with a very long relaxation time for global 
order. These systems are in the main focus of many exper¬ 
imental studies on colloids. In particular, one would like 
to really compare the “free energies” of crystalline and dis¬ 
ordered systems, and one thus requires both the pressure 
and the chemical potentials. The question is of course, how 
far can we push the cavity-based methods? Are they suit¬ 
able to provide these informations? Despite the fact that 
this question is not directly the subject of the present pa¬ 
per, we think that we provide some elements to its possi¬ 
ble answer. First, a close look on figs. 3 and 4 reveals that 


the AV method can indeed give results close to the crys¬ 
talline phase, or even within. Its applicability thus has not 
a strict boundary but rather gradually decreases together 
with the uncertainty whether an average over N and over 
(A^—1) particles are equivalent. With the AVATO average, 
we here provide a second method which is at least concep¬ 
tually consistent. Comparison of FV and AVATO results 
therefore reveals possible problems with noise. The most 
direct information of our study on the above question is 
the importance of high precision: An experimental method 
shall provide individual radii with an error below 1% in 
order to allow proper extraction of thermodynamic infor¬ 
mation on dense systems (see fig. 3) - a challenging, but 
maybe not unreachable task. 

Surely, every experiment has different sources of noise 
which are more or less pertinent. We here provide an exam¬ 
ple for only one source of noise, or rather of missing infor¬ 
mation, which was motivated by an existing experimental 
situation [50]. Experimentalists who are interested in cali¬ 
brating their data analysis are invited to use our cavity al¬ 
gorithm, which will be made available as open source [48]. 
For those who do not want to undertake heavy calibra¬ 
tion, we propose the following rules of thumb: (i) Avoid 
the FV average. It is generally less robust against wrong ra¬ 
dius information than the other two methods. It has prob¬ 
lems with chemical potentials at moderately low densities, 
(ii) If the density is below f < 0.5, AVATO averages and 
AV averages give the same result. AVATO averages are con¬ 
ceptually cleaner, but depending on the number of classes, 
AV averages might be less expensive to calculate, (iii) If 
your system is dense disordered or crystalline, use more 
than one method and compare, (iv) If you want to imple¬ 
ment only one method, stay with AVATO. 

As a final remark, we would like to say that the algo¬ 
rithm is not restricted to thermal systems. The physical 
interpretation, however, starts from the assumption that 
all configurations are realised with the same probability. 
It could therefore be interesting to apply the algorithms 
also to systems which are not at equilibrium and where 
equivalence of all ensembles is not guaranteed [51]. 

We thank Daniel Bonn and Rojman Zargar for interesting us 
for the free-volume methods. 
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